Genome diversity and evolutionary characteristics of clinical isolates of Bordetella pertussis circulating in Iran.

Background and Objectives: The re-emergence of pertussis still is being reported all over the world. Pathogen adaptation and antigenic divergence of circulating isolates from vaccine strains are the main reasons of infection resurgence. Waning immunity is also an important factor contributing to resurgence of pertussis. Materials and Methods: The genetic diversity and evolutionary characteristics of circulating Iranian isolates of Bordetella pertussis during February 2015 to October 2018 was investigated by pulsed-field gel electrophoresis (PFGE) and subsequently ptxA, ptxP and fim3 alleles were characterized. The next generation genome sequencing was then used to compare the genomics of ptxP1 and ptxP3 of selected isolates from PFGE dendrogram. Results: PFGE differentiated 62 clinical isolates and vaccine and reference strains into 19 PFGE profiles, indicating the higher level of heterogeneity in the population during 2015–2018. The predominant B. pertussis genotype harbored pertussis toxin promoter allele, ptxP3 and the expansion of ptxA1 isolates, were also observed in our population. Conclusion: No changes in allelic profile of predominant clone in recent years was observed but antigenic divergence between recently circulating isolates and the vaccine strain has been progressed and significantly was higher than previous studies. The comparative genomic analysis of the ptxP3 and ptxP1 isolates indicate that changes in ptxP3 genome structure including 32 unique SNPs and three unique indels may have contributed to the expansion of the ptxP3 clone. We compared ptxP3 and ptxP1 isolates in pathogenicity-associated genes and found five of them were specific for the ptxP3 isolates. The polymorphisms in pathogenicity-associated genes suggest structural adaptations for these virulence factors.


INTRODUCTION
Whooping cough (pertussis) is a highly contagious, acute respiratory illness of humans that is caused by the Gram-negative bacterium Bordetella pertussis. The infection can be life threatening in in-fants, who are too young to be vaccinated or are not yet fully vaccinated (1).
Introduction of whole cell vaccine (WCV) during the 1950s significantly reduced the morbidity and mortality of pertussis. However, due to the side effect of WCV, acellular vaccine (ACV) was developed in the 1980s (2). Although pertussis is relatively well controlled by extensive vaccination programs, re-emergence of pertussis has been reported in many countries with high vaccination coverage, including the European countries, the United States and Australia (3)(4)(5)(6). Pathogen adaptation and antigenic divergence of circulating isolates from vaccine strains are the main reasons of infection resurgence. Waning immunity is also an important factor contributing to resurgence of pertussis (7,8).
While a vast majority of developed countries switched from WCV to ACV, whole cell-based combination vaccines is still in use in developing countries (9). In Iran, the whole-cell pertussis vaccine was introduced for children in 1950s and continued until now. Children are immunized with three doses of diphtheria-tetanus, whole-cell pertussis (DTwP) vaccine at 2, 4 and 6 month. Then, young children receive two booster to maintain that protection through in 18 month and 6-years old (10). Despite high pertussis vaccination coverage in Iran, (96% since 2000) pertussis incidence is still the highest amongst all vaccine-preventable diseases. Our population has experienced pertussis resurgence since 2007 and reached its peak in 2012 and 2013 (11).
Recently, antigenic divergence has been reported in different countries. Molecular studies are important for monitoring the spread of antigenic divergence between clinical isolates and vaccine strain (12). B. pertussis has conventionally typed by pulsed-field gel electrophoresis (PFGE). This method has been frequently used worldwide to provide laboratory data for characterization of B. pertussis isolates (13). PFGE achieves some level of resolution that resulted in the high structural dynamics of B. pertussis genome (14,15). On the other hand, antigenic divergence of B. pertussis can be easily achieved by point mutation. Genotyping of pathogenicity-associated genes is very precious to find the polymorphism of these genes. Comparative analysis of B. pertussis genomes by whole genome sequencing is important to study B. pertussis population diversity (16,17).
The aims of this study, was to investigate the genetic diversity of recent circulating clinical B. pertussis isolates, using PFGE and genotyping of pathogenicity-associated genes during February 2015 to October 2018. Finally, we selected three ptxP1 isolates and three ptxP3 isolates from PFGE dendrogram for comparative genomics of ptxP1 and ptxP3 isolates, to know the cause of the ptxP3 isolates outbreak in our population using next generation genome sequencing.

MATeRIALS AND MeTHODS
Bacterial isolates. In the current study, 62 clinical isolates were obtained from Pertussis Reference Laboratory of Pasteur Institute of Iran during February 2015 to October 2018. Isolates collected from different province of Iran including Tehran, Esfahan, Khorasan, East and West Azarbaijan, Khuzestan. Bp134, Bp509 as vaccine isolates and Bp18323 (ATCC 9797), Tohama I as reference strains, were used for validation.
Bacterial isolates were re-cultured on Regan-Lowe medium containing charcoal agar and 10% defibrinated sheep blood, and incubated at 37 °C for 72 h. B. pertussis isolates were confirmed by a combination of colony morphology, growth rate, Gram stain and conventional biochemical tests such as oxidase and use of specific Bordetella antiserums (Difco B. pertussis Antiserum, Rabbit serum for slide agglutination). DNA extraction was performed using High Pure PCR Template Preparation Kit (Roche Diagnostics GmbH, Mannheim, Germany). Real-time PCR was performed by targeting IS481, ptxP, IS1001 and IS1002 for species confirmation of B. pertussis. Primers and probes sequences are available in Table 1 (18).
Genotypic analysis. The polymorphism of pathogenicity-associated genes including ptxA, ptxP and fim3 was determined by DNA sequencing. PCR was performed with specific primers as described previously (19)(20)(21)(22). PCR products were purified and sent for Sanger sequencing (Macrogen, South Korea). The sequences were read by using MEGA5 software in conjunction with reference strains Tohama I (23).
Pulsed-field gel electrophoresis. PFGE profiles were obtained using the XbaI enzyme (Fermentas, ABI, and Germany). The restriction fragments were separated using a CHEF-DRIII system (Bio-Rad), with initial switch time: 5s, final switch: 45 s and run time: 21 h. Fragments patterns were analyzed with . Salmonella enterica serotype Braenderup strain H9812 was used as size marker. Dendrogram was drawn by unweighted-pair group method using the arithmetic average algorithm (UPGMA) with 2% band tolerance and 2% optimization settings with the Dice coefficient.
Bacterial selection and whole genome sequencing. Six isolates were selected based on the hospitalized patients, mortality and the results of PFGE dendrogram in present and our previous study (24). We choose three ptxP1 isolates and three ptxP3 isolates for comparative genomics analysis. Details of clinical isolates are shown in Table 3.
Genomic DNA was extracted and purified from pure culture using phenol-chloroform method (25). DNA libraries were constructed using Nextera XT kit (Illumina, San Diego, USA) according to manufacturer's protocol and sequenced on the Illumina NextSeq instrument using 2 × 150 bp paired-end protocol. The contigs were aligned to the reference B. pertussis strain Tohama I (Gene Bank accession number BX470248) using progressive Mauve (version 2.3.1) and the de novo assembly of the raw reads were performed using SPAdes version 3.13.0 (26,27). To extract Single nucleotide polymorphism (SNPs) a combination of Burrows-Wheeler Alignment (BWA) tools (version 0.7.5), Samtools (version 0.1.19) and progressive Mauve (28,29) were used. Briefly, the filtered SNPs from mapping were compared to the SNPs exported by progressive Mauve and final SNPs were selected for further analysis. Distribution of short insertions/deletions (indels), which are less than 100 bp, were also identified using SAMtools. The nucleotide sequences have been deposited under bio sample accession number PRJ-NA600651. The version described in this paper is the first version.
PFGe analysis of Iranian B. pertussis isolates from 2015 to 2018. PFGE was highly discriminatory between our isolates, and identified 19 distinct PFGE profiles, named BP 1 -BP 19 (Fig. 1). All profiles were closely related to each other and based on similarity higher than 95%, they were classified into 14 major clades (A-N). Four profiles consisted of closely related isolates (BP 1 -BP 4 ). The major profile was clade A, which consisted of 41 isolates and characterized by ptxA1/ptxP3/fim3-2. BP2 as a major group among isolates, comprised 31 isolates that classified in clade A.
Clade H indicated the same profile as the Iranian vaccine seed strain Bp134 with a little diversity in ptxA antigen (due to point mutation). In clade I, only one isolate, grouped with reference strain BP18323 (9797) with the same PFGE pattern. This strain (strain no. IR142) was isolated in 2015 with the simi-lar allelic profile to B. pertussis 18323 (ptxA5/ptxP4/ fim3-1) and had low similarity with all clinical isolates tested. Finally, we found no correlation between year and state of isolation in PFGE dendrogram.
Whole genome sequencing and identification of single nucleotide polymorphisms. In this study, we illustrated comparative genomics of ptxP1 and ptxP3 isolates (Table 3). We calculated an approximate genome size for each isolate using qualimap. The average genome size of our clinical isolates is 3, 885, 151 bp. A total of 366 SNPs was identified, of which 102 SNPs were in intergenic (IG) region and 264 located in genes, while 149 SNPs were non-synonymous (nsSNPs), 115 SNPs were synonymous (sSNPs).
Twenty SNPs were common in all isolates, among them, eight were intergenic (IG) and 15 located in genes. The nsSNPs located in BP1660 (sphB2), found in all isolates, is an auto transporter and involved in pertussis pathogenicity. PtxP3 isolates differentiated from ptxP1 isolates by 32 unique SNPs, among them seven were intergenic and 24 SNPs were in coding regions (Table 4). There is one ptxP3 isolate (IR175), carrying prn9 allele. Thirteen unique SNPs were detected related to prn9 isolate with ptxA1/ptxp3/ fim3-2/fim2-1 antigenic profile. The main nsSNPs mutations located in BP1226 (conserved hypothetical), BP2548 (regulation) and BP3405, nsSNPs in BP3405 (ompQ) were only in prn9 Iranian isolates.
We compared ptxP3 and ptxP1 isolates in pathogenicity-associated genes, we found five of them were specific for the ptxP3 isolates (Fig. 2). Polymorphisms were found in pathogenicity-associated genes suggesting structural adaptations for these virulence factors.
On the other hand, the SNP density (the number of SNPs per bp of gene in each category) of ptxP3 and ptxP1 isolates was calculated. The overall SNP density of the whole genomes were, 0.0008 SNPs/bp with uses of functional categories defined by Parkhill et al. (30). The highest SNP density were in ribosome constituents (0.002SNPs/bp), phage-related or transposon-related (0.0014SNPs/bp) and regulation (0.0011SNPs/bp). The main SNP density differences between ptxP3 and ptxP1 isolates in cell processes and ribosome constituents that shows in Fig. 3.
Finally, a total of 40 indels were found in our isolates, 24 deletion and 16 insertion. Twenty-four indels were in genes and 16 were intergenic (IG). Among indels located in genes, 19 were frameshifts, of which 10 resulted from single base pair (bp) indels, two from three bp indels and six each from 4, 7, 8 and 31 bp indels. Two frameshifts indels located in BP2232, BP2928 had a stop codon resulting in proteins that were shorter than expected which were considered as pseudogenes (30). Three indels were unique in ptxP3, all as frameshift mutations. BP0880 was a pseudogene encoding for a putative exported protein and BP1054 codes prn precursor that cause prn new allele production (Table 5).

DISCUSSION
When the first pertussis surveillance system has started monitoring whooping cough in our country, predominant profile expresses ptxP3/ptxA1/fim3-2.
In previous study carried out by our team analyzed the alle typing and PFGE profile of clinical B. pertussis isolates collected during 2008-2015. In this study we continued PFGE typing of new isolates collected during 2015-2018. In previous study, Heravi       et al. showed that the majority of current circulating B. pertussis isolates in Iran are ptxP3 strains following the worldwide pattern (24). In the current study, there was no changes in allelic profile of predomi-nant clone in recent years but antigenic divergence between recently circulating isolates and the vaccine strain has progressed and significantly is higher than previous years. We found that almost all circulating isolates harbored ptxP3, ptxA1 and fim3-2 alleles, while our vaccine seed isolates Bp134 shows ptxA2/ ptxP1/fim3-1 alleleic profile and vaccine seed isolates Bp509 shows ptxA4/ptxP2/fim3-1. Divergences between the vaccine isolates and PFGE predominant clade are clear and may have effect on vaccine efficacy.
In the present study, we used PFGE, as a tool to monitor the pertussis population. Nineteen distinct PFGE profiles from phylogenetic analysis of Iranian pertussis population indicated the higher level of heterogeneity in the population during 2015-2018. Some PFGE patterns of currently predominant B. pertussis clones were found within PFGE groups BP2 that classified in clade A, this predominant clonal type represent a single profile identical to the European profile BpSR11, which is part of PFGE group IV (31,32). This profile comprised 10 to 50% of the isolates, from all European countries (except Poland) and characterized to possess ptxA1/ptxP3/fim3-2 (32)(33)(34). Similar PFGE profiles were observed in Canada with 82% of the isolates represented, ptxA1/ptxP3/fim3-2 and belonged to PFGE group IV (35). In these countries vaccinations with WCV were started 50 to 60 years ago and switched to ACV 10 to 20 couple of years ago (32,35,36). Iranian predominant profiles are different from other countries using whole cell vaccines such as Poland. The replacement of the B. pertussis population in Iran by Clade A with ptxP3/ ptxA1 profile, where only WCV has been used, suggests that the expansion of ptxP3 strains was not driven by ACV selection alone. In Poland where WCV has also been in use for decades, in 2000-2013, ptxP1 isolates predominated with 77.5%. However, in a recent study in Poland during 2010-2016, 61.5% of the isolates were found to be ptxP3 and were attributed to the increasing use of ACV since 2013 for primary immunization. (37,38). We conclude that, the rate of predominant allelic profiles not necessarily related to the type of population vaccine.
In the last ten years two ptxP alleles, ptxP1 and ptxP3, were predominated among the Iranian populations of B. pertussis. The ptxP3 isolates were observed in 2005 for the first time and then gradually increased in frequency and now in recent years has been replaced by ptxP1 isolates, while we had only 16.7% ptxP1 among our isolates. Predominant ptxP3 profile appears to have infection-associated biologic characteristics such as higher Ptx production and enhanced respiratory colonization which contributed to the pertussis epidemic that differ from those of ptxP1 isolates (21,39,40).
Recently, whole-genome sequencing has been used to identify genetic changes in the B. pertussis population. For better understanding of ptxP3 and ptxP1 genomic diversity, we compared whole genome sequences of ptxP3 genomes and ptxP1 isolates. We confirmed that ptxP3 isolates differed from ptxP1 isolates in genomics. There are 32 unique SNPs that differentiated the ptxP3 isolates from ptxP1 isolates. The most important mutations in the virulence genes unique for ptxP3 strains that occurs in prn, ptxC and fim3, responsible for changes in alleles of the genes encoding vaccine antigens and may be the cause of ptxP3 lineage expansion in our population. Our results showed that ptxP3 genomic profile is similar to those collected worldwild due to the better genome fitness of ptxP3 isolates, the most important mutations in the virulence genes unique for ptxP3 isolates that occurs in sphB1, prn, ptxC, bscI and fim3, responsible for changes in alleles of the genes encoding vaccine antigens and may be the cause of ptxP3 lineage expansion (40,41). BP2249 (bscI), BP1568 (fim3), that maybe adaptive and due to the emergence of a new allele, nsSNPs in fim3 was involved in antigenic shift from fim3-1 to fim3-2 (5).
On the other hand, ptxP1, contains isolates which represent ptxA1/ptxP1 and ptxA2/ptxP1. PtxA1/ptxP1 was distinguished by 35 unique SNPs from ptxA2/ ptxP1 isolates. The main non-synonymous mutations found in ptxP1/ptxA1 profile located in BP3812 and BP3869 are involved in cell surface, BP1568 encoding fim3 involve in pathogenicity, and BP3137, putative two-component system sensor protein involved in regulation.
In conclusion, currently in Iran the surveillance system is affected by many limitations. The underestimation of pertussis in adolescents and adults is mainly related to the atypical clinical characteristics of cases and lack of lab confirmation, the age group is considered as the reservoir of infection.
Epidemiological results of sequencing and PFGE is very helpful to control and prevention of pertussis. However, Iran has not experienced pertussis epidemic after 2013, but our finding suggest the fact that the presence of non-vaccine ptxP3/ptxA1, may associate with rising trend of pertussis among populations since 2007. The comparative genomic analysis of the ptxP3 and ptxP1 isolates provided new insights into the evolution of pertussis isolates in Iran.
Our findings illustrate that changes in ptxP3 genome structure including 32 unique SNPs and three unique indels may have contributed to the expansion of the ptxP3 clone. Furthermore, this is the first study about comparative genome analysis of ptxP1 and ptxP3 allele type with next generation sequencing, in Iran. These results could be impressive in controlling of infection and help policy makers in the healthcare and community settings but there is a need to perform more sequence of ptxP3 isolates in order to obtain a better picture of the pathogen adaptation under vaccine pressure.

ACKNOWLeDGeMeNTS
This study was supported by institute Pasteur of Iran (Grant No: B-9112). We would also like to show our gratitude to all the staff in Pertussis Reference Laboratory of institute Pasteur of Iran. We would like to express our special thanks to Prof. Ruiting Lan at UNSW for helping the analysis process.